Abstract

The paper discusses how compositional 
semantics is implemented in the Verb- 
mobil speech-to-speech translation sys- 
tem using LUD, a description language 
for underspecified discourse representa- 
tion structures. The description lan- 
guage and its formal interpretation in 
DRT are described as well as its imple- 
mentation together with the architecture 
of the system's entire syntactic-semantic 
processing module. We show that a lin- 
guistically sound theory and formalism 
can be properly implemented in a sys- 
tem with (near) real-time requirements. 

1 Introduction 

Contemporary syntactic theories are normally 
unification-based and commonly aim at specifying 
as much as possible of the peculiarities of specific 
language constructions in the lexicon rather than 
in the "traditional" grammar rules. When doing 
semantic interpretation within such a framework, 
we want a formalism which allows for 

• compositionality, 

• monotonicity, and 

• underspecification. 

Compositionality may be defined rather strictly
so that the interpretation of a phrase always
should be the (logical) sum of the interpretations
of its subphrases. A semantic formalism that is
compositional in this strict sense would also
trivially be monotonic, since no destructive
changes would need to be undertaken while building
the interpretation of a phrase from those of its
subphrases.¹

However, compositionality is more commonly 
defined in a wider sense, allowing for other 
mappings from subphrase-to-phrase interpreta- 
tion than the sum, as long as the mappings are 
such that the interpretation of the phrase still is a 
function of the interpretations of the subphrases. 
A common such mapping is to let the interpre- 
tation of the phrase be the interpretation of its 
(semantic) head modified by the interpretations 
of the adjuncts. If this modification is done by 
proper unification, the monotonicity of the for- 
malism will still be guaranteed. 
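
The claim that modification by unification preserves monotonicity can be illustrated with a minimal sketch. It assumes feature structures are plain nested Python dictionaries; the functions `unify` and `subsumes` and the feature names (`pred`, `tense`, `mod`) are hypothetical and not taken from the Verbmobil system.

```python
def unify(a, b):
    """Return the unification of two feature structures, or None on a clash."""
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)                      # copy: the inputs are never changed
        for key, value in b.items():
            if key in result:
                merged = unify(result[key], value)
                if merged is None:
                    return None               # incompatible values
                result[key] = merged
            else:
                result[key] = value           # new information is only added
        return result
    return a if a == b else None


def subsumes(general, specific):
    """True if every piece of information in `general` is also in `specific`."""
    if isinstance(general, dict):
        return isinstance(specific, dict) and all(
            key in specific and subsumes(value, specific[key])
            for key, value in general.items()
        )
    return general == specific


# Hypothetical head and adjunct interpretations.
head = {"pred": "treffen", "tense": "pres"}
adjunct = {"mod": {"pred": "morgen"}}
phrase = unify(head, adjunct)
```

In this sketch the phrase interpretation is subsumed by both the head and the adjunct interpretation, so nothing contributed by a subphrase is destroyed.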

In many applications of Computational Linguistics,
for example semantically based translation as in
Verbmobil (the German national spoken language
translation project described in Section 2), a
complete interpretation of an utterance is not
always needed or even desirable. Instead of trying
to resolve ambiguities, for example the ones
introduced by different possible scopings of
quantifiers, the interpretation of the ambiguous
part is left unresolved. The semantic formalism of
such a system should thus allow for the
underspecification of these unresolved ambiguities
(but still allow for them to be resolved in a
monotonic way, of course). An underspecified form
representing an utterance is then the
representation of a set of meanings, namely all the
possible interpretations of the utterance.

The rest of the paper is structured as follows.
Section 2 gives an overview of the Verbmobil
Project. Section 3 introduces LUD (description
Language for Underspecified Discourse
representations), the semantic formalism we use.
Section 4 compares our approach to that of others
for similar tasks. The actual implementation is
described in Section 5, which also discusses
coverage and points to some areas of further
research. Finally, Section 6 sums up the previous
discussion.

¹ More formally, a semantic representation is
monotonic iff the interpretation of a category on
the right side of a rule subsumes the
interpretation of the left side of the rule.

2 The Verbmobil Project 

The project Verbmobil, funded by the German
Federal Ministry of Research and Technology
(BMBF), combines speech technology with machine
translation techniques in order to develop a
system for translation in face-to-face dialogues.
The overall project is described in (Wahlster,
1993); in this section we will give a short
overview of the key aspects.

The ambitious overall objective of the Verb- 
mobil project is to produce a device which will 
provide English translations of dialogues between 
German and Japanese businessmen who only have 
a restricted active, but larger passive knowledge of 
English. The domain is the scheduling of business 
appointments. The major requirement is to pro- 
vide translations as and when users need them, 
and do so robustly and in (near) real-time. 

In order to achieve this, the system is composed 
of time-limited processing components which on 
the source language (German or Japanese) side 
perform speech recognition, syntactic, semantic 
and pragmatic analysis, as well as dialogue man- 
agement; transfer on a semantic level; and on 
the target language (English) side generation and 
speech synthesis. When the users speak English, 
only keyword spotting for the dialogue manage- 
ment is undertaken. 

At any moment in the dialogue, a user may 
activate the Verbmobil device and start speak- 
ing his/her native language. The speech recog- 
nition component then processes the input and 
produces a word lattice representing the speech 
hypotheses and their corresponding prosodic in- 
formation. The parsing component processes the 
lattice and assigns each well-formed path through 
it one or several syntactic and (compositional) se- 
mantic representations. Ambiguities introduced 
by these may be resolved by a resolution compo- 
nent. The representations produced are then as- 
signed dialogue acts and used to update the model 
of the discourse, which in turn may be used by the 
speech recognizer to choose the current language 
model. The transfer component takes the (possi- 
bly resolved) semantic analysis of the input and 
builds a target language representation. The gen- 
erator then constructs the corresponding English 
expression. For robustness, this deep-level pro- 
cessing strategy is complemented with a shallow 
analysis-and-transfer component.



3 Underspecified Representations 

3.1 Theoretical Background 

Since the Verbmobil domain is related to dis- 
course rather than isolated sentences, a variant 
of Kamp's Discourse Representation Theory, DRT 
(Kamp and Reyle, 1993) has been chosen as the 
model theoretic semantics. However, to allow for 
underspecification of several linguistic phenom- 
ena, we have chosen a formalism that is suited 
to represent underspecified structures: LUD, a 
description language for underspecified discourse 
representations (Bos, 1995). The basic idea is the
one given in Section 1, namely that natural language
expressions are not directly translated into
Discourse Representation Structures (DRSs), but 
into a representation that describes several DRSs. 

Representations in LUD have the following dis- 
tinct features. Firstly, all elementary seman- 
tic "bits" (conditions, entities, and events) are 
uniquely labeled. This makes them easy to refer 
to and results in a very powerful description lan- 
guage. Secondly, meta variables over DRSs (which 
we call holes) allow for the assignment of under- 
specified scope to a semantic operator. Thirdly, 
a subordination relation on the set of holes and 
labels constrains the number of interpretations of 
the LUD-representation in the object language: 
DRSs. 

3.2 LUD-Representations 

A LUD-representation $U$ is a triple

$\langle H_U, L_U, C_U \rangle$

where $H_U$ is a set of holes (variables over labels),
$L_U$ is a set of labeled (LUD) conditions, and $C_U$
is a set of constraints. A plugging is a bijective
function from holes to labels. For each plugging
there is a corresponding DRS. The syntax of
LUD-conditions is formally defined as follows:

1. If $x$ is a discourse marker (i.e., entity
or event), then $dm(x)$ is a LUD-condition;

2. If $R$ is a symbol for an $n$-place relation
and $x_1, \ldots, x_n$ are discourse markers,
then $pred(R, x_1, \ldots, x_n)$ is a
LUD-condition;

3. If $l$ is a label or hole for a
LUD-condition, then $\neg l$ is a
LUD-condition;

4. If $l_1$ and $l_2$ are labels (or holes) for
LUD-conditions, then $l_1 \rightarrow l_2$,
$l_1 \wedge l_2$ and $l_1 \vee l_2$ are
LUD-conditions;

5. Nothing else is a LUD-condition.
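
The five clauses above can be mirrored as a small data model. The following is only a sketch: the class names, the string encoding of labels and holes, and the connective names `"imp"`, `"and"`, `"or"` are assumptions made for illustration (the first connective of clause 4 is taken to be implication), not part of the LUD implementation.

```python
from dataclasses import dataclass
from typing import Tuple

# Labels and holes are both plain strings in this sketch (e.g. "l1", "h1").


@dataclass(frozen=True)
class Dm:                 # clause 1: dm(x)
    marker: str


@dataclass(frozen=True)
class Pred:               # clause 2: pred(R, x1, ..., xn)
    relation: str
    args: Tuple[str, ...]


@dataclass(frozen=True)
class Not:                # clause 3: negation of a labeled condition
    body: str


@dataclass(frozen=True)
class Binary:             # clause 4: two labels or holes joined by a connective
    op: str               # assumed to be one of "imp", "and", "or"
    left: str
    right: str


def is_well_formed(cond, known_refs):
    """Check that a condition only mentions declared labels or holes."""
    if isinstance(cond, (Dm, Pred)):
        return True
    if isinstance(cond, Not):
        return cond.body in known_refs
    if isinstance(cond, Binary):
        return (cond.op in {"imp", "and", "or"}
                and {cond.left, cond.right} <= known_refs)
    return False          # clause 5: nothing else is a LUD-condition


# Hypothetical labeled conditions; "h1" is a hole.
labeled = {"l1": Dm("x"), "l2": Pred("mann", ("x",)), "l3": Not("h1")}
refs = set(labeled) | {"h1"}
```

Because negation and the binary connectives take labels or holes rather than embedded conditions, the structure stays flat, and scope can be left open simply by using a hole.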



There are three types of constraints in
LUD-representations. There is subordination ($\leq$),
strict subordination ($<$), and finally
presupposition ($\alpha$). These constraints are
syntactically defined as:

If $l_1$, $l_2$ are labels and $h$ is a hole, then
$l_1 \leq h$, $l_1 < l_2$ and $l_1 \,\alpha\, l_2$ are
LUD-constraints.

The interpretation of a LUD-representation is 
the interpretation of top, the label or hole of a 
LUD-representation for which there exists no label 
that subordinates it.²

The interpretation function $I$ is a function from
a labeled condition to a DRS. This function is
defined with respect to a plugging $P$. We represent
a DRS as a box $D \mid C$, where $D$ is the set of
discourse markers and $C$ is the set of conditions.
The mappings between LUD-conditions and DRSs are
then defined in (1)-(9), where $l$ is a label or hole
and $\phi$ is a labeled condition.

In these definitions, the merge operation takes two
DRSs $K_1$ and $K_2$ and returns a DRS whose domain
is the union of the domains of $K_1$ and $K_2$, and
whose conditions are the union of the conditions
of $K_1$ and $K_2$.
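
The merge operation and the role of the plugging can be sketched as follows. This is an illustration under stated assumptions, not the paper's interpretation function. It assumes a DRS is a pair of frozen sets, conditions are encoded as tuples, a label may carry several conditions, and a hole is interpreted as the label plugged into it. The names `DRS`, `merge` and `interpret` are hypothetical.

```python
from typing import FrozenSet, NamedTuple


class DRS(NamedTuple):
    domain: FrozenSet[str]        # discourse markers D
    conditions: FrozenSet[tuple]  # conditions C


def merge(k1, k2):
    """Union of the domains and union of the conditions of two DRSs."""
    return DRS(k1.domain | k2.domain, k1.conditions | k2.conditions)


def interpret(ref, conds, plugging):
    """Interpret a label or hole as a DRS with respect to a plugging."""
    label = plugging.get(ref, ref)   # a hole stands for the label plugged into it
    result = DRS(frozenset(), frozenset())
    for cond in conds[label]:
        kind = cond[0]
        if kind == "dm":             # introduces a discourse marker
            part = DRS(frozenset({cond[1]}), frozenset())
        elif kind == "pred":         # introduces an atomic condition
            part = DRS(frozenset(), frozenset({cond[1:]}))
        elif kind == "not":          # embeds the DRS of its argument
            inner = interpret(cond[1], conds, plugging)
            part = DRS(frozenset(), frozenset({("not", inner)}))
        else:                        # binary connective over two labels or holes
            left = interpret(cond[1], conds, plugging)
            right = interpret(cond[2], conds, plugging)
            part = DRS(frozenset(), frozenset({(kind, left, right)}))
        result = merge(result, part)
    return result


# Hypothetical example: a negation whose scope is the hole h1.
conds = {
    "l1": [("not", "h1")],
    "l2": [("dm", "x"), ("pred", "mann", "x"), ("pred", "kommen", "x")],
}
plugging = {"h0": "l1", "h1": "l2"}
top_drs = interpret("h0", conds, plugging)
```

A different plugging over the same labeled conditions would yield a different DRS, which is how one LUD-representation can describe several readings.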



² The reader interested in a more detailed discussion
of the interpretation of underspecified semantic



3.3 Lexical Entries and Composition 

For building LUD-representations we use a
lambda-operator and functional application in
order to compositionally combine simple
LUD-representations into complex ones. In addition,
we have two functions that help us to keep track
of the right labels. These are top, as described
above, and main, the label of the semantic head of
a LUD-representation. Further, we have an
operation that combines two LUD-representations
into one (merge for LUD-representations). Some
sample lexical entries for German are given in
Figure 1.

4 Related Work 

The LUD representation is quite closely related to
UDRSs, underspecified DRSs (Reyle, 1993). The
main difference is that the LUD description
language is in principle independent of the object
language; thus not only DRT, but also ordinary
predicate logic, as well as Dynamic Predicate
Logic (Groenendijk and Stokhof, 1991), can be
used as the object language of LUD, as shown
in (Bos, 1995). Compared to UDRS, LUD also
has a stronger descriptive power: not DRSs, but
the smallest possible semantic components are
uniquely labeled.

The Verbmobil system is a translation system
built by some 30 different groups in three
countries. The semantic formalism used on the
English generation side has been developed by CSLI,
Stanford and is called MRS, Minimal Recursion
Semantics (Copestake et al., 1995). The deep-level
syntactic and semantic German processing of
Verbmobil is also done along two parallel paths.
The other path is developed by IBM, Heidelberg
and uses a variant of MRS, Underspecified Minimal
Recursion Semantics (UMRS) (Egg and Lebeth,
1995). All three formalisms, LUD, MRS, and UMRS,
have in common that they use a flat,
neo-Davidsonian representation and allow for the
underspecification of functor-argument relations.
In MRS, this is done by unification of the
relations with unresolved dependencies. This,
however, results in structures which cannot be
further resolved. In UMRS this is modified by
expressing the scoping possibilities directly as
disjunctions. The main difference between both
types of MRSs and LUD is that the interpretation
of LUD in an object language other than ordinary
predicate logic is well defined, as described in
Section 3.2.
The translation task of the SICS-SRI Bilingual
Conversation Interpreter, BCI (Alshawi et al.,
1991), is quite similar to that of Verbmobil.
The BCI does translation at the level of
Quasi-Logical Form, QLF, which is also a monotonic
representation language for compositional
semantics, as discussed in (Alshawi and Crouch,
1992). The QLF formalism incorporates a
Davidsonian approach to semantics, containing
underspecified quantifiers and operators, as well
as 'anaphoric terms' which stand for entities and
relations to be determined by reference
resolution. In these respects, the basic ideas of
the QLF formalism are quite similar to LUD.

representations is referred to (Bos, 1995).

Figure 1: Lexical entries and a sample derivation in LUD

5 Syntax-Semantics Interface and Implementation

5.1 Grammar 

The LUD semantic construction component has
been implemented in the grammar formalism
TUG, Trace and Unification Grammar (Block and
Schachtl, 1992), in a system called TrUG (in
cooperation with Siemens AG, Munich, who provided
the German syntax and the TrUG system). TUG
is a formalism that combines ideas from
Government and Binding theory, namely the use of
traces, with unification in order to account for,
for example, the free word order phenomena found
in German.

5.1.1 Syntax and Semantics 

A TUG grammar basically consists of PATR-II 
style context free rules with feature annotations. 
Each syntactic rule gets annotated with a seman- 
tic counterpart. In this way, syntactic derivation 
and semantic construction are fully interleaved 
and semantics can further constrain the possible 
readings of the input. 

In order to make our formalisation executable,
we employ the TrUG system, which compiles our
rules into an efficient Tomita-style parser. In
addition, TrUG incorporates sortal information,
which is used to rank parsing results.

Consider a simplified example of a syntactic rule
annotated with a semantic functor-argument
application.



s ---> np, vp |
    np:agr = vp:agr,
    lud_fun_arg(s, vp, np)

In this example, a sentence s consists of an np 
and a vp. The first feature equation annotated to 
this rule says that the value of the feature agr (for 
agreement) of the np equals that of the respective 
feature value of the vp. 

5.1.2 The Composition Process 

A category symbol like np in the rule above also 
stands for the entry node of its associated feature 
structure. This property is used for the semantic
counterpart of the rule: lud_fun_arg is a call
to a semantic rule, a macro in the TUG notation,
which defines functor-argument application.
Since the macro gets the entry nodes of the fea- 
ture structures as arguments, all the information 
present in the feature structures can be accessed 
within the macro which is defined as 

lud_fun_arg(Result, Fun, Arg) =>
    lud_context_equal(Fun, Result),
    context(Fun, FunContext),
    context(Arg, ArgContext),
    subcat(Result, ResultSc),
    subcat(Fun, [ArgContext | ResultSc]).



The functor-argument application is based on
the notion of the context of a LUD-representation.
The context of a LUD-representation is a
three-place structure consisting of the
LUD-representation's main label and top hole (as
described in Section 3.3) and its main instance,
which is a discourse marker or a lambda-bound
variable. A LUD-representation also has a
semantic subcategorization list under the feature
subcat which performs the same function as a
λ-prefix. This list consists of the contexts of
the arguments a category is looking for.

The functor-argument application macro thus 
says the following. The context of the result is 
the context of the functor. The functor is look- 
ing for the argument as the first element on its 
subcat list, while the result's subcat list is that of 
the functor minus the argument (which has been 
bound in the rule). The binding of variables be- 
tween functor and argument takes place via the 
subcat list, through which a functor can access 
the main instance and the main label of its argu- 
ments and state relations between them. 
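
The behaviour just described can be imitated in a few lines of Python. This is a sketch under simplifying assumptions, not the TUG macro itself. Unification variables are simulated by placeholder strings starting with "?", contexts are triples of main instance, main label and top hole, and predicates are collected in a plain list. The names `Sign`, `fun_arg` and `substitute`, as well as the example predicates, are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

Context = Tuple[str, str, str]   # (main instance, main label, top hole)


@dataclass
class Sign:
    context: Context
    subcat: List[Context]            # contexts of the arguments still sought
    preds: List[Tuple[str, ...]]     # semantic predicates collected so far


def substitute(term, binding):
    """Replace bound placeholders in a tuple of strings."""
    return tuple(binding.get(item, item) for item in term)


def fun_arg(fun, arg):
    """Apply a functor to an argument via the functor's subcat list."""
    if not fun.subcat:
        raise ValueError("functor is not looking for an argument")
    wanted, rest = fun.subcat[0], fun.subcat[1:]
    binding = {}
    for slot, value in zip(wanted, arg.context):
        if slot.startswith("?"):
            binding[slot] = value          # bind placeholder to argument value
        elif slot != value:
            raise ValueError("context clash")
    return Sign(
        context=substitute(fun.context, binding),        # result context = functor context
        subcat=[substitute(c, binding) for c in rest],   # functor subcat minus the argument
        preds=[substitute(p, binding) for p in fun.preds] + arg.preds,
    )


# Hypothetical intransitive verb and noun phrase.
verb = Sign(("e", "l1", "h0"), [("?x", "?lx", "?hx")],
            [("l1", "kommen", "e"), ("l1", "arg1", "e", "?x")])
np = Sign(("x1", "l2", "h0"), [], [("l2", "peter", "x1")])
result = fun_arg(verb, np)
```

After application the verb's argument role refers to the main instance of the noun phrase, and the result has an empty subcat list.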

Note that the only relevant piece of informa- 
tion contained in a LUD-representation for the 
purpose of composition is its context. Its content 
in terms of semantic predicates is handled differ- 
ently. The predicates of a LUD-representation are 
stored in a special slot provided for each category 
by the TrUG system. The contents of this slot
are handed up the tree from the daughters to the
mother completely monotonically. So the predi- 
cates introduced by some lexical entry percolate 
up to the topmost node automatically. 

These two restrictions, the use of only a
LUD-representation's context in composition and the
monotonic percolation of semantic predicates up
the tree, make the system completely
compositional in the sense defined in Section 1.

5.1.3 The lexicon 

To see how the composition interacts with the
lexicon, consider the following lexical macro
defining the semantics of a transitive verb:

trans_verb_sem(Cat, Rel, [Role1, Role2]) =>
    basic_pred(Rel, Inst, L1),
    udef(Inst, L2),
    group([L1, L2, ArgL1, ArgL2], Main),
    leq(Main, Top),
    lud_context(Cat, Inst, Main, Top),
    role(Inst, Role1, Arg1, ArgL1),
    role(Inst, Role2, Arg2, ArgL2),
    subcat(Cat, [lud(Arg1, _, _),
                 lud(Arg2, _, _)]).



The macro states that a transitive verb in- 
troduces a basic predicate of a certain relation 
with an instance and a label. The instance is 
related to its two arguments by argument roles. 
The arguments' instances are accessed via the 
verb's subcat list (and get bound during functor- 
argument application, cf. above). The labels in- 
troduced are grouped together; the group label is 
the main label of the LUD-representation, the in- 
stance its main instance. Another property of the 
verb's semantics is that it introduces the top hole 
of the sentence. 
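
What the lexical macro assembles can be spelled out as a minimal Python sketch. It assumes terms are encoded as tuples, fresh names come from a simple counter, and the not yet bound argument instances are placeholder strings. The function names, the dictionary layout, and the relation and role names in the example are hypothetical.

```python
from itertools import count


def make_fresh():
    """Return a generator of fresh names such as "e1", "h2", "l3"."""
    counter = count(1)

    def fresh(prefix):
        return f"{prefix}{next(counter)}"

    return fresh


def trans_verb_sem(rel, roles, fresh):
    """Build the semantic terms of a transitive verb as plain tuples."""
    role1, role2 = roles
    inst, top = fresh("e"), fresh("h")           # main instance and top hole
    l1, l2, arg_l1, arg_l2, main = (fresh("l") for _ in range(5))
    arg1, arg2 = "?arg1", "?arg2"                # bound later by functor-argument application
    return {
        "context": (inst, main, top),
        "subcat": [(arg1, "?_", "?_"), (arg2, "?_", "?_")],
        "terms": [
            ("basic_pred", rel, inst, l1),
            ("udef", inst, l2),
            ("group", (l1, l2, arg_l1, arg_l2), main),
            ("leq", main, top),                  # the main label is subordinate to the top hole
            ("role", inst, role1, arg1, arg_l1),
            ("role", inst, role2, arg2, arg_l2),
        ],
    }


# Hypothetical relation and role names.
entry = trans_verb_sem("treffen", ("agent", "theme"), make_fresh())
```

The single `leq` term is the only scope information the verb contributes, which leaves the position of the verb's group below the top hole underspecified.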

5.2 Interfaces to Other Components 

As sketched in Section 2, our semantic
construction component delivers output to the
components for semantic evaluation and transfer.
The paragraphs that follow describe the common
interface to these two components.

5.2.1 Resolution of Underspecification 

Generating a scopally resolved LUD-representation
from an underspecified one is the process
which we referred to as plugging in Section 3.2.
It aims at making the possibly ambiguous
semantics captured by a LUD-representation unique.
Obviously, purely mathematical approaches for
transforming the partial ordering encoded in the
leq constraints into a total ordering may yield
many results.

Fortunately, linguistic constraints allow us to 
reduce the effort that has to be put into the com- 
putation of pluggings. An example is the linguis- 
tic observation that a predicate that encodes sen- 
tence mood in many cases modifies all of the re- 
mainder of the proposition for a sentence. Thus, 
pluggings where the predicate for sentence mood 
is subject to a leq constraint should not be con- 
sidered. They would result in a resolved structure 
expressing that the mood-predicate does not have 
scope over the remaining proposition. This would 
be contrary to the linguistic observation. 
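
The effect of such a linguistic filter can be illustrated with a brute-force sketch. It is not the resolution component of Verbmobil. It assumes that a plugging is admissible if every leq constraint holds in the resulting tree and no label ends up inside its own scope, and it models the mood observation as a requirement that a given label be plugged into the top hole. All names and the example structure are hypothetical.

```python
from itertools import permutations


def scope(ref, plugging, embeds, seen=frozenset()):
    """Labels below a hole or label; None if the plugging is cyclic."""
    label = plugging.get(ref, ref)
    if label in seen:
        return None
    result = {label}
    for child in embeds.get(label, ()):
        sub = scope(child, plugging, embeds, seen | {label})
        if sub is None:
            return None
        result |= sub
    return result


def pluggings(holes, labels, embeds, leq, top=None, outermost=None):
    """Enumerate bijections from holes to labels that respect the constraints."""
    for perm in permutations(labels):
        plugging = dict(zip(holes, perm))
        if outermost is not None and plugging[top] != outermost:
            continue                      # linguistic filter: fixed widest scope
        scopes = [scope(hole, plugging, embeds) for _, hole in leq]
        if all(s is not None and label in s
               for (label, _), s in zip(leq, scopes)):
            yield plugging


# Hypothetical structure: l1 and l2 are scope-bearing and contain the holes
# h1 and h2; l3 is the core proposition; h0 is the top hole.
holes = ["h0", "h1", "h2"]
labels = ["l1", "l2", "l3"]
embeds = {"l1": ["h1"], "l2": ["h2"]}
leq = [("l3", "h1"), ("l3", "h2"), ("l1", "h0"), ("l2", "h0")]

all_readings = list(pluggings(holes, labels, embeds, leq))
mood_first = list(pluggings(holes, labels, embeds, leq, top="h0", outermost="l1"))
```

In this example two pluggings satisfy the constraints, and fixing `l1` as the outermost element leaves exactly one of them, so fewer candidates have to be checked.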

5.2.2 Supplementary Information 

As a supplement to semantic predicates, our
output contains various kinds of additional
information. This is caused by the overall
architecture of the Verbmobil system, which does not
provide for fully interconnected components. There
is, e.g., no direct connection between the speech
recognizer and the component for semantic
evaluation. Thus, our component has to pipe certain
kinds of information (like prosodic values).
Accordingly, our output consists of "Verbmobil
Interface Terms" (VITs), which differ slightly from
the LUD-terms described above, mainly in that they
include non-semantic information.

5.3 Implementation Status 

Currently, the lexicon of the implemented system 
contains about 1400 entries (full forms) and the 
grammar consists of about 400 syntactic rules, 
of which about 200 constitute a subgrammar for 
temporal expressions. The system has been tested 
on three simplified dialogues from a corpus of spo- 
ken language appointment scheduling dialogues 
collected for the project and processes about 90% 
of the turns the syntax can deal with. 

The system is currently being extended to cover 
nine additional dialogues from the corpus com- 
pletely. The size of the lexicon will then be about 
2500 entries, which amounts to about 1700 lem- 
mata. 

6 Conclusions 

We have discussed the implementation of a
compositional semantics in the Verbmobil
speech-to-speech translation system. The notions of
monotonicity and underspecification were discussed
and LUD, a description language for
underspecified discourse representation structures,
was introduced. As shown in Section 3, the LUD
description language has a well-defined
interpretation in DRT. Unlike Reyle's UDRSs,
however, LUD assigns labels to the minimal semantic
elements and may also be interpreted in object
languages other than DRT.

The key part of the paper, Section 5, showed
how the linguistically sound LUD formalism has
been properly implemented in a (near) real-time
system. The implementation in Siemens' TUG
grammar formalism was described together with
the architecture of the entire semantic processing
module of Verbmobil and its current coverage.